Abstract

Background: CRISPR/Cas (Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR associated sequences) 
is a recently discovered prokaryotic defense system against foreign DNA, including viruses and plasmids. A CRISPR
cassette is transcribed as a continuous transcript (pre-crRNA), which is processed by Cas proteins into small RNA
molecules (crRNAs) that are responsible for defense against invading viruses. Experiments in E. coli report that
overexpression of cas genes generates a large number of crRNAs from only a few pre-crRNAs.

Results: We here develop a minimal model of CRISPR processing, which we parameterize based on available 
experimental data. From the model, we show that the system can generate a large amount of crRNAs, based on 
only a small decrease in the amount of pre-crRNAs. The relationship between the decrease of pre-crRNAs and the 
increase of crRNAs corresponds to strong linear amplification. Interestingly, this strong amplification crucially 
depends on fast non-specific degradation of pre-crRNA by an unidentified nuclease. We show that overexpression 
of cas genes above a certain level does not result in a further increase of crRNA, but that this saturation can be
relieved if the rate of CRISPR transcription is increased. We furthermore show that a small increase of CRISPR 
transcription rate can substantially decrease the extent of cas gene activation necessary to achieve a desired 
amount of crRNA. 

Conclusions: The simple mathematical model developed here is able to explain existing experimental observations 
on CRISPR transcript processing in Escherichia coli. The model shows that a competition between specific pre-crRNA 
processing and non-specific degradation determines the steady-state levels of crRNA and is responsible for strong 
linear amplification of crRNAs when cas genes are overexpressed. The model further shows how disappearance of 
only a few pre-crRNA molecules normally present in the cell can lead to a large (two orders of magnitude) increase 
of crRNAs upon cas overexpression. A crucial ingredient of this large increase is fast non-specific degradation by an 
unspecified nuclease, which suggests that a yet unidentified nuclease(s) is a major control element of CRISPR 
response. Transcriptional regulation may be another important control mechanism, as it can either increase the 
amount of generated pre-crRNA, or alter the level of cas gene activity. 
Background

CRISPR (Clustered Regularly Interspaced Short Palin- 
dromic Repeats) cassettes are present in almost every 
known archaeal genome and in about half of the known 
bacterial genomes [1-3]. A CRISPR cassette consists of 
identical direct repeats of about 30 bp in length, inter- 
spaced with spacers of similar length [4]. The length of 
different spacers within the same cassette is the same, 
while sequences of these spacers are different. In many 
organisms, these spacer sequences closely match sequences 
of bacteriophages (bacterial viruses) infecting this or closely 
related organisms [5-7]. It was recently discovered that 
CRISPR/Cas loci function as an adaptive immunity system, 
which is responsible for defending prokaryotic cells against
viruses and plasmids [8,9]. A match between a CRISPR spacer
and a sequence in invading DNA provides immunity to
infection [5-9].

In E. coli, promoters that transcribe CRISPR cassettes
and cas genes are distinct, and are (at least under normal
growth conditions) considered to be poorly active
due to repression by the H-NS transcription factor [10]. The
entire CRISPR cassette is transcribed as a long continu- 
ous transcript [10,11], which is then processed by one of 
the Cas proteins (CasE), to small RNA molecules 
(crRNAs) [11,12]. Once crRNAs are generated, they bind 
a large multisubunit complex of Cas proteins called Cas- 
cade and target it to matching DNA of viruses and plas- 
mids, ultimately leading to its destruction [13]. 

While it is clear that the CRISPR/Cas system in E. coli is
functional [11,14], virus infection in itself appears not to
lead to system induction (at least under normal conditions)
[15], and physiological conditions under which
the system is induced have yet to be determined [13].
Consequently, functioning of this system has been investigated
by either artificial overexpression of cas genes
and CRISPR array from plasmids, or by inhibition of H-NS
repression of cas and CRISPR promoters [11,12,16].
In a recent study, cas genes were overexpressed in E. coli,
and resulting changes in the levels of pre-crRNAs
and crRNAs were quantitatively measured [11]. In cells
with endogenous (uninduced) cas expression, the abun- 
dance of pre-crRNA and individual crRNAs was low, 
below 10 molecules per cell. When CasE was overex- 
pressed, the abundance of crRNAs increased dramatic- 
ally, to about 1000 molecules per cell, while pre-crRNA 
became undetectable. There is, therefore, a large (at least 
two orders of magnitude) increase in abundance of indi- 
vidual crRNAs, accompanied by a much smaller (less 
than tenfold) decrease of pre-crRNA. It remains unclear 
if (and by what model) this strong amplification of 
crRNA upon cas overexpression can be explained. 
Answering this question is a major goal of this paper. 

Furthermore, the experiments discussed above correspond
to measurements where cas genes and CRISPR
arrays are overexpressed to a fixed level [10-12,16]. On
the other hand, it is important to explore how changes 
of the relevant parameters affect generation of crRNAs, 
since such understanding can provide important clues 
about the mechanism of the endogenous system induc- 
tion. Finally, the available experiments correspond to 
steady-state measurements of transcript amounts, 
i.e. come from measurements taken long after cas genes 
overexpression has been induced. However, the steady- 
state regime may not be directly relevant for system 
function under natural conditions, where the amount of 
generated crRNA immediately after system induction
(for example, after virus infection) may be more
relevant. While it is hard to experimentally assess either
different levels of parameter changes or kinetics of the 
transcript accumulation, this analysis can be readily 
done through mathematical modeling, which is another 
major goal of this paper. 

In this paper, we present a simple mathematical
model of CRISPR expression that is able to i) determine
biochemical parameters relevant for CRISPR transcript
processing, ii) explain the observed large amplification
of crRNAs, iii) assess how different levels of change in
the transcription and processing rates affect steady-state
levels and kinetics of crRNA accumulation.

Results 

Model definition 

In this section, we will propose a simple model of 
CRISPR transcript processing. The model is in accord- 
ance with the following experimental observations: 

i) Endogenous (uninduced) levels of pre-crRNAs and 
crRNAs are low (~10 copies per cell) [11,12,16],
which was reported to be a consequence of 
repression of cas and (to a smaller extent) CRISPR 
promoters by H-NS [10]. 

ii) One of the Cas proteins (CasE) is responsible for 
processing pre-crRNAs to crRNAs [11,12]. When CasE
is overexpressed, the amount of crRNAs increases by
about two orders of magnitude, while the amount of
pre-crRNAs drops to only a few transcripts per cell [11].
Overexpression of CasE affects only the processing rate 
of pre-crRNA to crRNA, since it has been shown [11] 
that CasE does not influence either pre-crRNA 
transcription rate or crRNA stability. 

iii) In addition to being processed by CasE, pre-crRNA 
is also degraded by an unspecified nuclease [10,11]. 
As a consequence of this degradation, pre-crRNA 
decays with a half-life of ~1 min without generating 
crRNAs. On the other hand, crRNAs are observed 
to be much more stable [11]. 

iv) It is currently unclear how the CRISPR/Cas system is
induced under natural conditions [13]. It was,
however, shown that the repression of the cas
promoter by H-NS can be relieved by a 
transcription activator (LeuO) [16]. It was 
consequently proposed that the endogenous system 
induction may involve activation of cas and (to a 
smaller extent) CRISPR promoters, through 
abolishment of H-NS repression [10]. 

The simplest model of CRISPR transcript processing,
which is in accordance with the experimental observations
summarized above, is schematically shown in Figure 1.
In the scheme, we denote concentrations of the
unprocessed (pre-crRNA) and processed (crRNA) transcripts
as, respectively, [u] and [p]. The unprocessed
transcripts (pre-crRNAs) are transcribed with rate $\varphi$;
pre-crRNAs are further either non-specifically degraded
with rate $\lambda_u$, or processed by CasE with rate k. By
non-specific degradation, we mean degradation that does not
lead to accumulation of crRNA. Processing of pre-crRNA
by CasE leads to formation of individual crRNAs,
which are further degraded with rate $\lambda_p$. Based on the
experimental results [11], we take $\lambda_u \sim 1$ min$^{-1}$,
$\lambda_p \sim 1/100$ min$^{-1}$, and $[u] \sim [p] \sim 10$.
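
The scheme in Figure 1 can be written as a pair of rate equations. The form below is a sketch that assumes first-order kinetics for every arrow in the scheme; it is inferred from the description above rather than copied from the Methods.

```latex
\begin{align}
\frac{d[u]}{dt} &= \varphi - (\lambda_u + k)\,[u], \\
\frac{d[p]}{dt} &= k\,[u] - \lambda_p\,[p].
\end{align}
```

Under this assumption, setting both derivatives to zero gives the steady-state amounts $[u] = \varphi/(\lambda_u + k)$ and $[p] = k[u]/\lambda_p$.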

While the uninduced values of pre-crRNA transcription
and processing rates ($\varphi$ and k) have not been experimentally
measured, they can be determined from equations
that describe kinetics of the system in Figure 1 (see the
next section). When the system is induced, both k and $\varphi$
can be increased. Since CasE is solely responsible for processing
of pre-crRNA to crRNA, the value of the processing
rate k depends on the amount of CasE. Consequently,
the increase of k is due to an increased amount of CasE,
which is a consequence of a larger transcription activity of
cas promoters. Similarly, $\varphi$ can be increased if the CRISPR
promoter becomes more active.

In the next subsection, we will show that the simple
model, schematically shown in Figure 1, together with
experimentally inferred parameter values summarized
above, can indeed explain the observed large crRNA
amplification upon induction of cas gene expression. We
will afterwards explore kinetics of crRNA generation,
and investigate how modulation of pre-crRNA transcription
and processing rate ($\varphi$ and k) affects generated
crRNA amounts.

Uninduced system parameters 

Starting from equations that describe the system kinetics
(see Methods), it is straightforward to obtain expressions
for uninduced values of pre-crRNA transcription and
processing rates ($\varphi$ and k):

$$\varphi = \lambda_u [u] + \lambda_p [p], \tag{0.1}$$

$$k = \lambda_p \frac{[p]}{[u]}. \tag{0.2}$$

In the equations above, [u] and [p] are, respectively,
(uninduced) steady state amounts of pre-crRNA and
crRNA, while $\lambda_u$ and $\lambda_p$ are defined in Figure 1.
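
As a quick numerical illustration of the two expressions above, the following sketch plugs in the values quoted in the Model definition section. It assumes the steady-state balance $\varphi = \lambda_u[u] + \lambda_p[p]$ and $k = \lambda_p[p]/[u]$ implied by the scheme in Figure 1; the variable names are chosen here for illustration only.

```python
# Uninduced steady-state inputs quoted in the text
# (rates in 1/min, amounts in molecules per cell).
lambda_u = 1.0        # non-specific pre-crRNA decay rate
lambda_p = 1.0 / 100  # crRNA decay rate
u = 10.0              # steady-state pre-crRNA amount
p = 10.0              # steady-state crRNA amount

# Assumed steady-state balance: production of each species equals its loss.
phi = lambda_u * u + lambda_p * p   # pre-crRNA transcription rate
k = lambda_p * p / u                # pre-crRNA to crRNA processing rate

print(f"phi = {phi:.2f} per min, k = {k:.3f} per min")
```

With these inputs the transcription rate comes out close to 10 per minute and the processing rate close to 1/100 per minute.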

By using the numerical values stated in the previous
section, from Eq. (0.1) we obtain $\varphi \sim 10\lambda_u \sim 10$ min$^{-1}$. This
value corresponds to a moderately strong transcription
activity; note that transcription activity of very strong
rRNA promoters is ~60 min$^{-1}$, while basal activity of a
very weak uninduced $\lambda$ $P_{RM}$ promoter is ~1/7 min$^{-1}$ [17].
It is interesting that in experimental studies the CRISPR 
promoter was labeled as weak, based on measured small 
amount of pre-crRNA [10,11]. The small amount of pre- 
crRNA is actually a consequence of a high non-specific 
decay rate of pre-crRNA (note that pre-crRNA half life 
is ~1 min), which has to be matched by the relatively 
high activity of the CRISPR promoter. The moderately 
high transcription rate of the CRISPR promoter implies 
a weak repression of this promoter by H-NS, which is 
consistent with the experimental finding that repression 
Figure 1 Model of CRISPR transcript processing. The unprocessed transcripts (pre-crRNAs) are generated with rate $\varphi$, and are consequently
either (non-specifically) degraded with rate $\lambda_u$, or processed to crRNAs by CasE with rate k. crRNAs are then degraded with rate $\lambda_p$. Concentrations
of pre-crRNAs (unprocessed transcripts) and crRNAs (processed transcripts) are denoted as, respectively, [u] and [p].



Djordjevic et al. Biology Direct 2012, 7:24 
http://www.biology-direct.com/content/7/1/24






of the CRISPR promoter by H-NS is significantly weaker 
compared to the repression of the cas promoter [10,16]. 

Similarly, by using numerical values from the previous
subsection and Eq. (0.2), we obtain $k \sim \lambda_p \sim 1/100$ min$^{-1}$.
Therefore, the pre-crRNA to crRNA processing rate (k) is
two orders of magnitude smaller than the pre-crRNA decay
rate ($\lambda_u$). Due to this, when the system is uninduced,
almost all generated pre-crRNA is rapidly degraded (see
Figure 1), which results in small crRNA amounts, despite
the moderately high transcription rate ($\varphi$) of the
uninduced promoter. As we will show in the next subsection,
when the system is induced and k is increased,
the system switches from the state in which almost all of
the generated pre-crRNA is degraded, to the state in
which most of the generated pre-crRNA is processed to
crRNA.

Overexpression of cas genes 

We next analyze the experiments in which CasE is over- 
expressed, and the transcript numbers are quantified 
[11]. In these experiments, the number of pre-crRNA 
and crRNA transcripts has been measured both before 
and after the system induction. In the analysis below, we
assume that overexpression of CasE leads to an increase
of the pre-crRNA to crRNA processing rate from k to k',
while it has been experimentally shown that the rest of
the parameters remain unchanged (see above). Furthermore,
we denote pre-crRNA and crRNA amounts upon
CasE overexpression as, respectively, [u]' and [p]'. Note
that primes in our notation correspond to the quantity
values after the system induction, rather than to
derivatives.

We aim to understand the large amplification of
crRNA, where, upon CasE overexpression, a decrease
from about 10 pre-crRNA transcripts present in uninduced
cells leads to an increase of about two orders of magnitude
in the amount of crRNA (~1000 transcripts). To
that end, it is useful to derive a relationship between the
changes in the number of crRNAs ($\Delta[p] = [p]' - [p]$) and
pre-crRNAs ($\Delta[u] = [u]' - [u]$). By using the equations for
the system kinetics (see Methods), one can derive the
following (exact) relation:

$$\Delta[p] = -\frac{\lambda_u}{\lambda_p}\,\Delta[u]. \tag{0.3}$$

Note that the minus sign indicates that a decrease in
the number of unprocessed transcripts (pre-crRNA)
leads to an increase in the number of processed transcripts
(crRNA). From the relationship above, it follows
that the crRNA increase is directly proportional to the
pre-crRNA decrease, where the constant of proportionality
is equal to 100 ($\lambda_u/\lambda_p \sim 100$; see the previous section).
This large constant of proportionality in Eq. (0.3)
explains the experimentally observed large amplification
of crRNA upon CasE overexpression. That is, according
to Eq. (0.3), a decrease of ~10 molecules in pre-crRNA
($|\Delta[u]| \sim 10$) leads to a two orders of magnitude larger
increase in crRNA ($\Delta[p] \sim 1000$), as observed in the
experiments. Therefore, Eq. (0.3) shows that the system
acts as a strong linear amplifier, where the increase of
crRNA is directly proportional to the decrease of pre-crRNA,
and where a small number of pre-crRNAs are
amplified to a large number of crRNAs.
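
The linear relation in Eq. (0.3) can be checked numerically. The sketch below assumes the steady-state expressions $[u] = \varphi/(\lambda_u + k)$ and $[p] = k[u]/\lambda_p$ that follow from the scheme in Figure 1, uses the parameter values quoted in the text, and introduces the helper `steady_state` purely for illustration.

```python
lambda_u, lambda_p = 1.0, 0.01   # decay rates from the text (1/min)
phi = 10.1                       # transcription rate consistent with [u] = [p] = 10
k = 0.01                         # uninduced processing rate (1/min)

def steady_state(k_value):
    """Steady-state (pre-crRNA, crRNA) amounts for a given processing rate."""
    u = phi / (lambda_u + k_value)
    p = k_value * u / lambda_p
    return u, p

u0, p0 = steady_state(k)
ratios = []
for fold in (10, 100, 1000, 10000):
    u1, p1 = steady_state(fold * k)
    delta_u, delta_p = u1 - u0, p1 - p0
    ratios.append(delta_p / delta_u)
    print(f"k x{fold}: delta_u = {delta_u:.2f}, delta_p = {delta_p:.1f}, "
          f"ratio = {delta_p / delta_u:.1f}")
```

Whatever the increase in k, the ratio of the crRNA change to the pre-crRNA change stays at $-\lambda_u/\lambda_p = -100$, which is the sense in which the amplification is linear.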

Experiments also report that, upon Cas overexpression,
the amount of pre-crRNA decreases by about one
order of magnitude, which allows estimating the extent
of increase of the pre-crRNA to crRNA processing rate (k).
From equations that describe the system kinetics (see
Methods), it is straightforward to show that the relative
decrease of pre-crRNA amount is given by

$$\frac{[u]}{[u]'} = \frac{\lambda_u + k'}{\lambda_u + k}. \tag{0.4}$$

It is experimentally observed that $[u]/[u]' \sim 10$, so from
Eq. (0.4) it follows that $\lambda_u + k' \sim 10(\lambda_u + k)$. Since we obtained that
$k \ll \lambda_u$, it follows that $k' \sim 10\lambda_u$, i.e. due to the overexpression
of CasE, the processing rate becomes an order
of magnitude larger than the pre-crRNA decay rate. Therefore,
the overexpression of CasE makes the system
switch from the state in which almost all of the generated
pre-crRNA is degraded, to the state where most of
the generated pre-crRNA is processed to crRNA.
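
The estimate of the induced processing rate can be reproduced in a few lines. The sketch below assumes that the relative decrease of pre-crRNA equals $(\lambda_u + k')/(\lambda_u + k)$, as follows from the steady state of the scheme in Figure 1, and solves it for k'; the variable names are illustrative.

```python
lambda_u = 1.0     # non-specific pre-crRNA decay rate (1/min)
k = 0.01           # uninduced processing rate (1/min)
fold_drop = 10.0   # observed [u]/[u]' upon CasE overexpression

# Solve [u]/[u]' = (lambda_u + k') / (lambda_u + k) for k'.
k_induced = fold_drop * (lambda_u + k) - lambda_u

print(f"k' = {k_induced:.2f} per min, "
      f"i.e. {k_induced / k:.0f} times the uninduced rate")
```

The result is close to $10\lambda_u$, which corresponds to an increase of roughly three orders of magnitude over the uninduced processing rate.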

We will below use the values of the system parameters 
that were estimated above, in order to numerically inves- 
tigate kinetics of the transcript accumulation. To investi- 
gate the kinetics, we will simulate the system both 
deterministically and stochastically; we perform the stochastic
simulations since the numbers of uninduced pre-crRNA
and crRNA molecules are small, and since the
number of pre-crRNA molecules becomes even smaller 
as CasE is overexpressed. However, we will see in the 
subsequent figures that the stochastic and deterministic 
results are in agreement with each other, which validates 
that the simple analytic expressions that we derive 
(e.g. Eq. (0.3)) can be used to describe the system. 

We first numerically investigate how the amounts of
unprocessed and processed transcripts change as k is
increased (i.e. as CasE is overexpressed). Stochastic
simulations are performed by using the Gillespie stochastic
simulation algorithm [18], and stochastic trajectories are
shown together with the deterministic curves. Figure 2A
corresponds to the uninduced system, where the uninduced
system parameters (see the previous section) lead
to the experimentally observed steady state values
($[u] \sim [p] \sim 10$). In Figure 2B, we increase the value of k
1000 times; note that this k increase corresponds to
Figure 2 Increase of pre-crRNA processing rate. The first and the second row in the panel correspond, respectively, to the number of pre-crRNA
and crRNA molecules. The first, the second, and the third column correspond, respectively, to A), B) and C). The deterministic simulation
corresponds to the magenta dashed line, while ten simulated stochastic trajectories correspond to the full blue curves. The parameter values are
as experimentally measured, or as inferred from the measurements by the analysis: $\lambda_u = 1$ min$^{-1}$, $\lambda_p = 1/100$ min$^{-1}$, $k = 1/100$ min$^{-1}$,
$\varphi = 10$ min$^{-1}$, [u] = [p] = 10. The system is induced so that $\varphi$ remains constant, while the pre-crRNA processing rate (k): A) remains the
same as the uninduced value, B) increases by three orders of magnitude (as in CasE overexpression experiments in [11]), C) increases
by an additional order of magnitude relative to B). The figure shows that CasE overexpression can lead to a large generation of crRNA,
but that increase of CasE above some value does not lead to an additional increase of crRNA amount (the saturation of crRNA).



CasE overexpression in [11] (see above). We see that for
such a k increase [u] drops to a very small amount (a few
transcripts per cell), while [p] increases by about two
orders of magnitude, consistent with the experimental
observations. In Figure 2C, k is increased by an additional
order of magnitude (i.e. 10000-fold relative to the
uninduced value). This additional increase of k leads to
an even smaller amount of pre-crRNA, while the amount of
crRNA increases by an additional small value (see the
discussion below).
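
A stochastic simulation of this kind can be sketched as follows. This is a minimal direct-method Gillespie implementation written for illustration, not the code used for Figure 2; it assumes the four reactions of the scheme in Figure 1, the parameter values quoted in the text with k raised 1000-fold (as in Figure 2B), and an arbitrary random seed. The function name `gillespie` and its arguments are hypothetical.

```python
import math
import random

def gillespie(phi, lambda_u, k, lambda_p, u0, p0, t_end, rng):
    """One stochastic trajectory of the two-species scheme; returns the final (u, p)."""
    t, u, p = 0.0, u0, p0
    while True:
        rates = (
            phi,            # transcription: u -> u + 1
            lambda_u * u,   # non-specific degradation: u -> u - 1
            k * u,          # processing by CasE: u -> u - 1, p -> p + 1
            lambda_p * p,   # crRNA degradation: p -> p - 1
        )
        total = sum(rates)
        # Waiting time to the next reaction is exponentially distributed.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            return u, p
        # Pick the reaction with probability proportional to its rate.
        threshold = rng.random() * total
        if threshold < rates[0]:
            u += 1
        elif threshold < rates[0] + rates[1]:
            u -= 1
        elif threshold < rates[0] + rates[1] + rates[2]:
            u -= 1
            p += 1
        else:
            p -= 1

rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility
finals = [gillespie(10.0, 1.0, 10.0, 0.01, 10, 10, 300.0, rng) for _ in range(10)]
mean_p = sum(p for _, p in finals) / len(finals)
print("mean crRNA at 300 min over 10 runs:", mean_p)
```

Averaged over the ten trajectories, the crRNA count at 300 min should lie within the same order of magnitude as the deterministic steady state of roughly 900 molecules for these parameters.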

The results in Figures 2B and 2C clearly support Eq.
(0.3). That is, in both of the panels, the steady-state
amount of pre-crRNA decreases to very small levels
($|\Delta[u]| \sim 10$), which leads to an increase of about two orders
of magnitude in the steady-state crRNA amount ($\Delta[p] \sim 1000$).
Note that this is in accordance with Eq. (0.3), given that
the constant of proportionality between $\Delta[u]$ and $\Delta[p]$
equals 100 ($\lambda_u/\lambda_p = 100$). Furthermore, both the decrease
of [u] and the increase of [p] are somewhat larger
in Figure 2C compared to Figure 2B, which is again
consistent with the direct proportionality in Eq. (0.3).
Therefore, both analytical and numerical results show
that a small pre-crRNA decrease leads to a large crRNA
increase upon CasE overexpression. Interestingly,
this strong amplification crucially depends on loss of
pre-crRNA through fast non-specific degradation, i.e. on
a large $\lambda_u/\lambda_p$ ratio (see Eq. (0.3)).

Furthermore, we note that the increase of k by one
order of magnitude between Figures 2B and 2C leads to
only a small additional increase of crRNA (relative to the
one in Figure 2B), which we further refer to as saturation
of crRNA upon increase of pre-crRNA processing rate. To
additionally investigate this saturation, in Figure 3A we
systematically predict the effect of k increase on
unprocessed (pre-crRNA) and processed (crRNA) transcript
amounts. We see that, as k is increased beyond
1000-fold, the amounts of both pre-crRNA and crRNA
reach saturation; i.e. pre-crRNA and crRNA amounts do
not significantly change with further increase of k. The
saturation value of crRNA increase corresponds to ~100-fold.

To analytically understand the observed saturation of
crRNA upon k increase, it is straightforward to derive
(see Methods) the relative increase of crRNA, as the
pre-crRNA processing rate is increased from k to k':

$$\frac{\Delta[p]}{[p]} \approx \frac{\lambda_u/k + 1}{\lambda_u/k' + 1}. \tag{0.5}$$

From the above equation, it follows that as k' becomes
significantly larger than $\lambda_u$ (i.e. $k' > 10\lambda_u$), $\Delta[p]/[p]$ no
longer depends on k'; $\Delta[p]/[p]$ then reaches saturation, i.e.
approaches $\lambda_u/k$. Since $\lambda_u \sim 100k$, the saturation is
reached when the pre-crRNA processing rate is increased
more than 1000-fold, as a result of which [p]
increases by about two orders of magnitude.
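
The saturation can be made concrete by tabulating the steady-state crRNA ratio for increasing k'. The sketch below assumes the steady state of the scheme in Figure 1, for which the ratio of induced to uninduced crRNA is $[p]'/[p] = (\lambda_u/k + 1)/(\lambda_u/k' + 1)$ (so that $\Delta[p]/[p]$ is this ratio minus one); the helper `crrna_fold` is introduced here for illustration.

```python
lambda_u = 1.0   # non-specific pre-crRNA decay rate (1/min)
k = 0.01         # uninduced processing rate (1/min)

def crrna_fold(k_induced):
    """Ratio [p]'/[p] of induced to uninduced steady-state crRNA amounts."""
    return (lambda_u / k + 1.0) / (lambda_u / k_induced + 1.0)

folds = {fold: crrna_fold(fold * k) for fold in (10, 100, 1000, 10000, 100000)}
for fold, value in folds.items():
    print(f"k x{fold:>6}: crRNA x{value:.1f}")
```

Beyond a 1000-fold increase in k the ratio changes by less than ten percent and never exceeds $\lambda_u/k + 1$.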

Finally, in Figure 3B, we investigate in more detail the kinetics
of crRNA accumulation. Figure 2 shows that the steady
state is reached relatively slowly, i.e. ~300 min after the
system induction. However, when a virulent phage infects
E. coli, the cell lysis is typically complete well before
300 min post-infection; e.g. for the well known E. coli T7
and T3 phages, the cell lysis starts at ~20 min post-infection,
with complete shut-off of host functions occurring
much earlier [19]. Therefore, the steady state crRNA
levels are likely not directly relevant for E. coli defense
against phage infection. Due to this, in Figure 3B, we estimate
crRNA levels at 20 min after the system induction.
We see that, similarly to Figure 3A, as k is increased more
than 1000-fold, the crRNA amount at 20 min reaches saturation.
While these saturation levels (~200 transcripts) are
significantly smaller compared to the steady state values,
they are still much larger than crRNA levels at which a
partial protection against phage infection is observed (~10
crRNA transcripts as per [11]). Therefore, activation of
Figure 3 Kinetics of crRNA accumulation. The figure shows how pre-crRNA (the first row) and crRNA (the second row) amounts change as
the pre-crRNA processing rate (k) is increased. CRISPR transcription rate remains constant and has the same value as in Figure 2. The first and the
second column correspond, respectively, to A) equilibrium transcript amounts and B) transcript amounts at 20 min post-induction. The horizontal
axes in the figure correspond to k in multiples of $\lambda_u$, where k changes from the uninduced value ($\lambda_u/100$) to a very high value ($1000\lambda_u$). The
points on the horizontal axes are, for clearer presentation, plotted equidistantly, and correspond to k (in multiples of $\lambda_u$) values of: (1/100, 1/50,
1/10, 1, 10, 50, 100, 500, 1000). The magenta line and the blue triangles correspond, respectively, to the stochastic and the deterministic simulations.
The figure confirms the saturation effects observed in Figure 2, and suggests that the system is able to generate substantial crRNA amounts soon
after its induction.
cas expression leads to a rapid accumulation of crRNA,
which suggests such activation can lead to an effective
protection against phage infection.
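
The early-time crRNA level can also be estimated in closed form. The sketch below assumes the linear rate equations implied by Figure 1, a step change of the processing rate at the moment of induction, and the parameter values quoted in the text; the function `crrna_at` is a hypothetical helper, and its formula is the standard solution of a linear two-variable cascade.

```python
import math

lambda_u, lambda_p = 1.0, 0.01   # decay rates (1/min)
phi = 10.0                       # transcription rate (1/min)
u0, p0 = 10.0, 10.0              # uninduced amounts at the moment of induction

def crrna_at(t, k_induced):
    """crRNA amount t minutes after the processing rate jumps to k_induced."""
    a = lambda_u + k_induced                 # total pre-crRNA loss rate
    u_ss = phi / a                           # new pre-crRNA steady state
    p_ss = k_induced * u_ss / lambda_p       # new crRNA steady state
    # Fast transient driven by the relaxation of pre-crRNA.
    c = k_induced * (u0 - u_ss) / (lambda_p - a)
    return p_ss + c * math.exp(-a * t) + (p0 - p_ss - c) * math.exp(-lambda_p * t)

p20 = {fold: crrna_at(20.0, fold * 0.01) for fold in (1000, 10000, 100000)}
for fold, value in p20.items():
    print(f"k x{fold}: crRNA at 20 min = {value:.0f}")
```

Under these assumptions the crRNA amount at 20 min is of the order of 200 transcripts and changes little once k has been raised 1000-fold.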

Joint overexpression of CRISPR and cas genes 

We next consider what happens if transcription of both
cas genes and CRISPR array is activated. This analysis is,
in part, motivated by reported repression of cas and (to
a smaller extent) CRISPR promoters by H-NS, and by a
model which proposes that the system is induced by
abolishing this repression [10]. Activation of cas genes
and CRISPR array transcription leads to an increase of
both the pre-crRNA processing rate (we assume that k
increases to k') and the CRISPR transcription rate (we assume
that $\varphi$ increases to $\varphi'$). It is straightforward to derive
(see Methods) that upon increase of both $\varphi$ and k,
the amount of generated crRNA is given by:

$$\frac{\Delta[p]}{[p]} \approx \frac{\lambda_u/k + 1}{\lambda_u/k' + 1}\,\frac{\varphi'}{\varphi}. \tag{0.6}$$



From Eq. (0.6), we see that the relative increase in crRNA
depends linearly on the relative increase of CRISPR transcription
rate ($\varphi'/\varphi$). From this it follows that the saturation
in crRNA due to increase of only k, which was
discussed in the previous subsection, can be relieved if $\varphi$
is increased as well.

Increase of crRNA due to joint increase of k and $\varphi$ is
numerically investigated in Figure 4. In this figure, k is
increased by the amount that corresponds to the saturation
(see the previous subsection), while $\varphi$ is increased
tenfold. Note that the tenfold increase in $\varphi$ approaches
the maximal biochemically realistic value, since the basal $\varphi$
value is already moderately high (~10 min$^{-1}$), while the
transcription rate of very strong rRNA promoters is
about one order of magnitude higher [17]. We see that
such an induction strategy leads to an even higher increase in
the amount of generated steady-state crRNA (~$10^3$-fold
relative increase of crRNA upon induction); similarly, the
amount of generated crRNA soon after the induction (e.g.
at 20 min post-induction), which may be relevant for
defense against bacteriophages, is much higher than the
minimal crRNA amount (~10 transcripts) necessary for
partial protection against viruses [11].
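
As a rough check of the steady-state figure quoted above, Eq. (0.6) can be evaluated for the Figure 4 scenario. The numbers below assume $\lambda_u/k = 100$, $k' = 1000k$ and $\varphi' = 10\varphi$, as stated in the text:

```latex
\frac{\Delta[p]}{[p]} \approx \frac{\lambda_u/k + 1}{\lambda_u/k' + 1}\,\frac{\varphi'}{\varphi}
= \frac{100 + 1}{0.1 + 1}\times 10 \approx 9\times 10^{2}
```

This is of the order of $10^3$, i.e. about ten times the saturation value obtained when only k is increased.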

In Figure 4, CRISPR transcription was increased in order
to relieve the saturation due to increase of only k (compare
with Figure 2C), and $\varphi$ was increased by a maximal biochemically
realistic value. Consequently, the crRNA amount
in Figure 4 roughly corresponds to the maximal value that
can be generated by the system. On the other hand, an increase
in CRISPR transcription can also be used to substantially
reduce the increase in pre-crRNA processing
rate, while still achieving the same increase in generated
crRNAs. This possibility is explored in Figure 5, which is,




Figure 4 Joint increase of k and $\varphi$. The figure shows how pre-crRNA
(the first row) and crRNA (the second row) change as k is
increased by three orders of magnitude (the saturation value; see
Figure 2), while $\varphi$ is increased by one order of magnitude. The initial
conditions and pre-crRNA and crRNA decay rates ($\lambda_u$ and $\lambda_p$) are the
same as in Figure 2. The figure shows that saturation in crRNA
amounts (due to increase of only k) can be relieved if $\varphi$ is increased
as well, which leads to a very large amount of generated crRNA.
Horizontal axis: time (min).



in part, motivated by experiments in which H-NS repression
of cas and CRISPR promoters is abolished [10,16].
Upon this abolishment, the amount of crRNA is increased
by about two orders of magnitude, i.e. by a value similar to
that in CasE overexpression experiments [11,16].

Figure 5 demonstrates that the two orders of magnitude
increase of crRNA can be achieved through very different
levels of increase of the pre-crRNA processing rate k, if the
CRISPR transcription rate is allowed to increase as well.
Accordingly, the three panels in Figure 5 show roughly the
same (two orders of magnitude) increase in crRNA levels,
which are achieved in the following way: i) in Figure 5A, k
is increased by three orders of magnitude, without increase
of $\varphi$, ii) in Figure 5B, k is increased by two orders of magnitude,
while $\varphi$ is increased twofold, iii) in Figure 5C,
both k and $\varphi$ are increased by one order of magnitude.
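
The three scenarios can be compared directly. The sketch below assumes the steady state of the scheme in Figure 1, for which the induced-to-uninduced crRNA ratio is $[p]'/[p] = (\lambda_u/k + 1)/(\lambda_u/k' + 1)\cdot\varphi'/\varphi$, with $\lambda_u/k = 100$ as estimated in the text; the helper `crrna_fold` and the dictionary layout are illustrative choices.

```python
lambda_u = 1.0   # non-specific pre-crRNA decay rate (1/min)
k = 0.01         # uninduced processing rate (1/min)

def crrna_fold(k_fold, phi_fold):
    """Ratio [p]'/[p] when k and phi are multiplied by k_fold and phi_fold."""
    return (lambda_u / k + 1.0) / (lambda_u / (k_fold * k) + 1.0) * phi_fold

# (fold increase of k, fold increase of phi) for the three panels.
scenarios = {"A": (1000, 1), "B": (100, 2), "C": (10, 10)}
results = {name: crrna_fold(*folds) for name, folds in scenarios.items()}
for name, value in results.items():
    print(f"Figure 5{name}: crRNA x{value:.0f}")
```

All three combinations give an increase of about two orders of magnitude, which illustrates how a modest rise in the transcription rate can stand in for a much larger rise in the processing rate.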

Figure 5 demonstrates that large amounts of crRNA can 
be generated without a large CasE overexpression - which 
is characteristic for the (artificial) overexpression experi- 
ments - as long as CRISPR array transcription is increased 
Figure 5 Reducing the increase of k through increase of $\varphi$. The figure shows the increase of crRNA over time (in min) as A) the pre-crRNA processing rate (k) is
increased 1000-fold relative to the uninduced value, while the CRISPR transcription rate ($\varphi$) is kept unchanged, B) k is increased 100-fold, while $\varphi$ is
increased twofold, C) both k and $\varphi$ are increased 10-fold. The figure shows that a moderate increase in $\varphi$ allows the increase in k to be substantially
reduced, while still achieving the same increase in crRNA amount.



by a much smaller amount. While conditions of natural
CRISPR/Cas induction are currently unclear [13], it is
likely that activation of the CRISPR array promoter is
much weaker compared to the activation of the cas promoter
(note that repression of the cas promoter by H-NS
was found to be significantly stronger than repression of
the CRISPR promoter) [10,16]. This, therefore, suggests
that conditions of natural system induction might roughly
correspond to Figure 5B (the increase of k that is much
larger than the increase of $\varphi$).

Discussion 

We here proposed a simple model of CRISPR transcript 
processing. We used this model, together with previous 
experimental measurements, to infer all the parameters 
that characterize the uninduced system. We showed that 
our model can explain the experimental observation that 
CasE-dependent decrease of very low initial steady-state 
level of E. coli pre-crRNA leads to a very large increase 
of crRNA abundance. Interestingly, this observation is a 
direct consequence of fast non-specific (i.e., not leading 
to crRNA) degradation of pre-crRNA. Our results, 
therefore, strongly suggest that non-specific degradation 
by a yet unidentified nuclease is a major control element
of CRISPR expression and CRISPR/Cas response.

It is interesting to note that while effects of activation of 
cas gene transcription on CRISPR/Cas system were exten- 
sively studied, there is a lack of such studies for activation 
of CRISPR array transcription. Specifically, changes of 
pre-crRNA and crRNA amounts were quantitated only for 
cas gene overexpression, but not for CRISPR array overex- 
pression [11]. Furthermore, while effects of cas gene over- 
expression on host protection against phage infection 
were measured [11], there is no such analysis for CRISPR 
array overexpression. That is, while in [12] it was shown 
that joint overexpression of cas genes and CRISPR array 



leads to efficient protection against bacteriophage infection,
it is unclear what additional protection is provided
due to CRISPR array overexpression. Finally, activation by
LeuO (a transcription regulator that abolishes H-NS repression)
was studied for the cas promoter [16], but
remains to be investigated for the CRISPR promoter.

Contrary to the almost complete emphasis on activation 
of cas gene transcription, the results presented here indi- 
cate that activation of CRISPR array transcription may be 
an important mechanism of CRISPR/Cas response. That 
is, we showed that there is a saturation of generated
crRNA upon overexpression of only cas genes, i.e. that the
amount of crRNA stops increasing when the rate of pre-crRNA
processing is increased above a certain level. This
saturation is relieved when the rate of CRISPR transcription
is increased as well, and we showed that a joint increase
in transcription rates of cas and CRISPR promoters
can lead to a very large (three orders of magnitude) increase
of steady state crRNA levels. We, moreover,
obtained that a substantial amount of crRNAs can be generated
soon after the system induction, which suggests
that the system may be capable of efficient protection
against viruses under natural conditions. Unlike the situation
observed in other bacteria, E. coli CRISPR spacers
for the most part do not match sequences in known phages
or plasmids. Yet, numerous data show that the E. coli
CRISPR/Cas system is functional once appropriate spacers
are introduced by means of genetic engineering [12,20].
Presumably, the mechanism of CRISPR transcript processing,
which was analyzed here, is relevant for protection
against E. coli phages that are yet to be identified [21].

As further support for the potential importance of
CRISPR array regulation, we showed that a modest increase
of the CRISPR transcription rate can substantially reduce
the increase of the pre-crRNA processing rate that is needed
to achieve a desired crRNA amount.
For example, an increase of the CRISPR transcription rate
as small as twofold reduces by one order of magnitude
the pre-crRNA processing rate needed to achieve
the two orders of magnitude increase of crRNAs (the increase
observed when H-NS repression is abolished).
Since repression of cas promoters by H-NS was found 
to be significantly stronger than the repression of 
CRISPR promoters, the regime in which the increase of 
pre-crRNA processing rate is significantly larger com- 
pared to the increase of CRISPR transcription rate may 
be directly relevant for natural system induction. 

Conclusions 

We here developed a simple model of CRISPR transcript 
processing, and showed that this model is able to explain 
the existing experimental observations. The model shows 
that the relationship between the relevant biochemical 
quantities can be viewed as strong linear amplification, 
where this effect is a consequence of fast non-specific degradation
of pre-crRNA. This implies that the unidentified
nuclease, which is responsible for the non-specific
degradation, is a major control element of CRISPR/Cas re- 
sponse. We furthermore pointed to the potential import- 
ance of regulation of CRISPR array transcription, which 
may be another important mechanism of CRISPR/Cas 
system induction. Elucidating how the system is induced 
under natural conditions remains a major question to be 
addressed by both experimental and theoretical research. 

Methods 

Overexpression of cas genes 

Kinetic equations that describe generation, degradation 
and processing of CRISPR transcripts (see Figure 1) are 
given below: 



$$\frac{d[u]}{dt} = \varphi - \lambda_u [u] - k[u] \qquad (0.7)$$

$$\frac{d[p]}{dt} = -\lambda_p [p] + k[u] \qquad (0.8)$$



Notation used in the above equations is described in 
Results and Figure 1. In the steady state $d[u]/dt = 0$ and
$d[p]/dt = 0$, so:

$$0 = \varphi - \lambda_u [u] - k[u] \qquad (0.9)$$

$$0 = -\lambda_p [p] + k[u] \qquad (0.10)$$

Upon CasE overexpression, the new steady state becomes: 

$$0 = \varphi - \lambda_u [u]' - k'[u]' \qquad (0.11)$$

$$0 = -\lambda_p [p]' + k'[u]' \qquad (0.12)$$

In the above equations, note that upon CasE overexpression,
the CRISPR transcription rate $\varphi$ and the crRNA stability
$\lambda_p$ do not change [11], while the pre-crRNA processing rate
increases to $k'$.

We next add Eq. (0.9) and Eq. (0.10), and likewise add
Eq. (0.11) and Eq. (0.12), which eliminates $k$ and $k'$.
Subtracting one sum from the other then gives:

$$\lambda_u([u] - [u]') - \lambda_p([p]' - [p]) = 0$$

In the above expression, $[u]' - [u]$ is the change of pre-crRNA
amount upon CasE overexpression, which we
label as $\Delta[u]$. Similarly, we label the change of crRNA as
$\Delta[p] = [p]' - [p]$. We therefore have:



$$\Delta[p] = -\frac{\lambda_u}{\lambda_p}\,\Delta[u] \qquad (0.13)$$
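
As a numerical check of Eq. (0.13), the steady states before and after CasE overexpression can be computed directly from the steady-state equations. The sketch below is only illustrative: the parameter values are hypothetical (chosen so that the ratio of the two decay rates is 100, the order of magnitude discussed in the text), and the function name `steady_state` is our own.

```python
def steady_state(phi, lam_u, lam_p, k):
    """Steady state of the two-species model: returns ([u], [p])."""
    u = phi / (lam_u + k)   # from 0 = phi - lam_u*[u] - k*[u]
    p = k * u / lam_p       # from 0 = -lam_p*[p] + k*[u]
    return u, p

# Hypothetical values, not the fitted parameters of the paper.
phi = 10.0       # CRISPR transcription rate (molecules/min)
lam_u = 1.0      # pre-crRNA non-specific decay rate (1/min)
lam_p = 0.01     # crRNA decay rate (1/min)
k = 0.01         # processing rate before induction (1/min)
k_prime = 1.0    # processing rate after CasE overexpression (1/min)

u, p = steady_state(phi, lam_u, lam_p, k)
u_new, p_new = steady_state(phi, lam_u, lam_p, k_prime)

delta_u = u_new - u          # change of pre-crRNA amount (negative)
delta_p = p_new - p          # change of crRNA amount (positive)
gain = -delta_p / delta_u    # should equal lam_u / lam_p by Eq. (0.13)

print(f"delta_u = {delta_u:.3f}, delta_p = {delta_p:.3f}, gain = {gain:.3f}")
```

Under these assumed values the decrease of pre-crRNA is a few molecules while the increase of crRNA is larger by the factor $\lambda_u/\lambda_p$, independently of the particular $k$ and $k'$ chosen.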



Furthermore, to calculate $[u]/[u]'$, we express $[u]$ from
Eq. (0.9) and $[u]'$ from Eq. (0.11) to obtain:

$$\frac{[u]}{[u]'} = \frac{\lambda_u + k'}{\lambda_u + k} \qquad (0.14)$$



Finally, we can solve for $[p]'$ from Eqs. (0.11) and
(0.12), and for $[p]$ from Eqs. (0.9) and (0.10) to obtain:

$$\frac{\Delta[p]}{[p]} = \frac{\lambda_u/k + 1}{\lambda_u/k' + 1} - 1 \qquad (0.15)$$
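
Eq. (0.15) also makes the saturation upon cas gene overexpression explicit: as $k'$ grows, the relative increase of crRNA approaches the ceiling $\lambda_u/k$ and cannot exceed it. A minimal sketch, assuming hypothetical values of $\lambda_u$ and $k$ (the function name `relative_increase` is ours):

```python
def relative_increase(k, k_prime, lam_u):
    """Relative crRNA increase, Delta[p]/[p], as given by Eq. (0.15)."""
    return (lam_u / k + 1.0) / (lam_u / k_prime + 1.0) - 1.0

# Hypothetical values, for illustration only.
lam_u = 1.0   # pre-crRNA non-specific decay rate (1/min)
k = 0.01      # processing rate before induction (1/min)

ceiling = lam_u / k   # limit of Eq. (0.15) for k' -> infinity

for factor in (10, 100, 1000, 10000):
    value = relative_increase(k, factor * k, lam_u)
    print(f"k'/k = {factor:>5}: Delta[p]/[p] = {value:.2f} (ceiling {ceiling:.0f})")
```

Each further tenfold increase of $k'$ brings a smaller gain, which is the saturation described in the Discussion.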



Joint increase of cas and CRISPR transcription 

When transcription of both CRISPR and cas genes is
increased, we assume that the CRISPR transcription rate
increases from $\varphi$ to $\varphi'$, while the pre-crRNA processing rate
increases from $k$ to $k'$. Then Eq. (0.11) and Eq. (0.12) become:

$$0 = \varphi' - \lambda_u [u]' - k'[u]' \qquad (0.16)$$

$$0 = -\lambda_p [p]' + k'[u]' \qquad (0.17)$$

After expressing $[p]'$ from Eqs. (0.16) and (0.17), and
$[p]$ from Eqs. (0.9) and (0.10) we obtain:

$$\frac{\Delta[p]}{[p]} = \frac{\lambda_u/k + 1}{\lambda_u/k' + 1}\,\frac{\varphi'}{\varphi} - 1 \qquad (0.18)$$
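
Eq. (0.18) can be inverted to ask which processing rate $k'$ is needed to reach a desired crRNA fold change for a given increase of CRISPR transcription. The sketch below does this under hypothetical parameter values, so the numbers it prints illustrate the trend only and are not the values reported in Results; the function names `fold_change` and `required_k_prime` are ours.

```python
import math

def fold_change(k, k_prime, lam_u, phi_ratio=1.0):
    """[p]'/[p], i.e. 1 + Delta[p]/[p] from Eq. (0.18); phi_ratio is phi'/phi."""
    return (lam_u / k + 1.0) / (lam_u / k_prime + 1.0) * phi_ratio

def required_k_prime(fold, k, lam_u, phi_ratio=1.0):
    """Solve Eq. (0.18) for k', given a target fold change [p]'/[p]."""
    denom = (lam_u / k + 1.0) * phi_ratio / fold - 1.0
    if denom <= 0.0:
        return math.inf   # target is at or above the saturation ceiling
    return lam_u / denom

# Hypothetical values, for illustration only.
lam_u = 1.0     # pre-crRNA non-specific decay rate (1/min)
k = 0.01        # processing rate before induction (1/min)
target = 100.0  # desired two orders of magnitude increase of crRNA

k_cas_only = required_k_prime(target, k, lam_u, phi_ratio=1.0)
k_joint = required_k_prime(target, k, lam_u, phi_ratio=2.0)

print(f"k' needed with unchanged transcription: {k_cas_only:.3g}")
print(f"k' needed with twofold transcription:   {k_joint:.3g}")
```

With these assumed values the target lies close to the saturation ceiling when only $k$ is increased, so a twofold increase of $\varphi$ lowers the required $k'$ by well over an order of magnitude. How large the reduction is depends on how close the target is to the ceiling $\lambda_u/k + 1$.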
Abstract, Background. Claiming that CRISPR repeats are palindromic is an
overstatement. Firstly, the repeats are only approximately palindromic. 
Secondly, not all repeats have palindromic structure. 
Authors' response 

We now reworded the abstract so that the term palindromic no longer
appears. This term was originally used in reference to the CRISPR acronym
(Clustered Regularly Interspaced Short Palindromic Repeats).

Abstract, Conclusions: "the extent of s gene activation" - it seems, "s" is all
that was left from "casE".

Authors' response 

This is now corrected. 

The half-life sometimes is measured in "min", and sometimes, in "1/min". 
Authors' response: This was a typo which we have now corrected (the half-lives
are now consistently measured in "min", while the decay rates are measured
in "1/min").

Results, Model definition. Points (i) and (iv) partially repeat each other. 
Authors' response 

We now modified these two points so that the redundancy is removed. 
Results, Uninduced system parameters: it would be better to state at the 
very beginning that k depends on the concentration of CasE. 
Authors' response 

We now included such statement before the beginning of this section (the 
third sentence in the second to last paragraph of the section Results, Model 
definition). 

Results, Overexpression of cas genes, para. 2: "10 transcripts lead to two 
orders of magnitude increase" - is not a clear sentence. 
Authors' response 

The word decrease was missing from the sentence, we now corrected this. 
Using primes (k') is not good notation when kinetics is considered: at
first glance it might be confused with the derivative.
Authors' response 

We are aware that using primes is not an ideal choice; however, all 
alternative solutions (e.g. using subscripts) that we could think of are more 
cumbersome, and may potentially lead to confusion as well; note that not 
only k, but also values of all other quantities after the system induction are 
marked with primes ([u]', [p]', φ'). To prevent possible confusion, we now
introduced the sentence in the text: "Note that primes in our notation 
correspond to the quantity values after the system induction, rather than to 
derivatives". We furthermore note that all the quantities in the text are 
clearly defined as they appear, which we hope prevents confusion with the 
derivatives. 

Discussion, para. 1: the possibility to use CRISPR in synthetic circuits seems
to be an overclaim. 
Authors' response 

We removed this statement, since this topic was indeed not analyzed in the 
paper. 

Reviewer 2 - Dr. Eugene Koonin

CRISPR-Cas is an extremely intriguing system the regulation of which under 
different conditions remains poorly understood. Therefore, diverse 
approaches to this problem are of interest. Here the authors develop an 
extremely simple mathematical model to describe the effect of over- 
expression of cas genes, and in particular casE, on the production of crRNA. 
The results indicate that, when cas genes are over-expressed, the amount of 
mature crRNA increases proportionate to the decrease in the amount of pre- 
crRNA with a coefficient equal to the ratio of the degradation rates of pre- 
crRNA and crRNA. Because this ratio is large, about 100, there is the 
amplification effect that is the main claim of the article: a small decrease in 
the amount of pre-crRNA results in a large increase in the amount of crRNA. 
In other words, as the authors point out, the system switches from a 'non- 
productive' regime, when almost all pre-crRNA is degraded, to a 'productive' 
regime when almost all pre-crRNA is processed into crRNA. In more specific 
terms, this happens because pre-crRNA is unstable whereas crRNA is stable, 
so excess of Cas proteins prevents degradation of pre-crRNA, hence (nearly) 
all pre-crRNA molecules are channeled into crRNA resulting in the 
'amplification effect'. I believe this scenario is straightforward and valid, and 
the model presented in the manuscript, although obvious, does quantify the 
effect, which makes the article worth the attention of researchers in the 
CRISPR field. 
Authors' response 

We thank Dr. Koonin for positive comments, and we addressed the 
suggestions regarding the paper presentation. 



However, having asserted the above, I also believe that the manuscript is in 
need of serious rewriting. The current version is obscure to the point of 
being misleading. Although it is technically correct to call the CRISPR-Cas 
system a linear amplifier, in reality, I think the very term amplification has a 
major confusing potential. My first thought when reading the title of the 
paper was that this is about actual replication of crRNA. My suggestion 
would be to get rid of 'amplification' altogether and replace it with 
'enhancement of crRNA production' or some such phrase. 
Authors' response 

We appreciate the comment that our statements - which were referring to 
relationships between biochemical quantities - may be confused for literal 
biological mechanisms, especially if a reader is exposed only to the title/ 
abstract. To address this comment, we did the following: i) Removed the 
term "amplification" from the title, as suggested, ii) Rewrote the abstract so 
that it is now clear that "amplification" refers to the derived relationship 
between the relevant biochemical quantities, rather than to a literal 
biological mechanism. In particular, it is now explicitly stated in the abstract: 
". . .The relationship between the decrease of pre-crRNAs and the increase of 
crRNAs corresponds to strong linear amplification. . ." Once the relevant
explanation is provided, we did not completely remove the term 
amplification from the abstract since, as mentioned by Dr. Koonin, it 
accurately represents the obtained relationship between the relevant 
quantities, iii) In the main text the term "amplification" first appears in 
Introduction, but this is immediately after the sentence which makes clear 
that we are referring to the relationship between the relevant quantities. 
Given this explanation we kept the term amplification further in the text.

"The model shows that the transcript processing corresponds to strong 
linear amplification of pre-crRNA. The strong amplification is due to fast non- 
specific degradation by an unspecified nuclease, suggesting that this 
nuclease is a major control element of CRISPR response". 
The first sentence in this quote is simply not understandable. The second 
one creates the impression that the uncharacterized nuclease somehow 
directly promotes that 'amplification'. The reality is that degradation of pre- 
crRNA as such leads to low level of crRNA production, just as one would 
expect; it is the prevention of the said degradation that produces the 
observed effect. There is much more such language in the current 
manuscript, so I think it should be carefully re-read and edited/rewritten. 
Authors' response 

Actually, the meaning of these statements was that the 'amplification'
depends directly on the ratio of the decay rates of pre-crRNA and crRNA
(as mentioned above by Dr. Koonin). Since the large decay rate of pre-crRNA 
is due to fast non-specific processing by an unidentified nuclease, this 
nuclease is likely a major control element of CRISPR response. However, 
similarly as above, we appreciate that the statements can be confused with 
the endonuclease being directly physically involved in crRNA amplification. 
We consequently removed/rewrote the two sentences, and the term 
"amplification" is now completely absent from Abstract-Conclusion. We also 
rewrote the relevant sentences in the main text along the same lines, so as 
to avoid possible confusion. 

Reviewer 3 - Dr. L. Aravind

The large amplification of the crRNAs with the concomitant decline of pre- 
crRNAs to near undetectability upon CasE overexpression is an interesting 
factor in the action of the CRISPR/Cas system. The authors attempt to 
explain this with a mathematical model and also suggest that their model 
might be useful to understand endogenous CRISPR processing. 
One useful point the authors show is that although the promoter of the 
CRISPR locus in E.coli has been described as weak, it is mainly a 
consequence of degradation of pre-crRNA rather than its low transcription. Then
using their model they explain quite clearly how the observed amplification 
can be accounted for. Their model also shows large increase in number of 
crRNAs is attained at distinct increased rates of pre-crRNA processing when 
combined with CRISPR transcription rate increase. At least some of these 
appear to provide reasonable pairs of values which might resemble 
physiological induction. 
Authors' response 

We thank Dr. L. Aravind for positive comments, and we have addressed the 
suggestions below.

While the model is well-explained and appears to account for the 
observations published earlier by the authors, a question remains regarding 
its actual significance for the E.coli CRISPR/Cas system: The system does not 
appear to be induced at all by PRD1 or lambda, for that matter. Further, the



Djordjevic et al. Biology Direct 2012, 7:24 
http://www.biology-direct.com/content/7/1/24



endogenous CRISPR sequences of E.coli do not seem to match any 
apparently prevalent plasmids or viruses. The authors state: "We, moreover,
obtained that a substantial amount of crRNAs can be generated soon after 
the system induction, which suggests that the system may be capable for 
efficient protection against viruses under natural conditions." While in 
principle this is correct, the actual situation seems to be different. If the Cas 
and CRISPR genes are not induced upon phage infection in E.coli then the 
high crRNA levels shortly after induction becomes moot. So it might be 
useful for the authors to stress that this aspect of their model is relevant only 
if the CRISPR/Cas genes are induced as they propose by some hypothetical 
phage (i.e. as yet unknown E.coli invasive DNA), even though this has not 
been actually observed in E.coli with any of the currently studied phages. 
Authors' response 

To address this comment we added the last three sentences in the second
to last paragraph of the Discussion. As discussed by Dr. Aravind, E. coli
CRISPR spacers indeed do not match sequences of known E. coli phages or
plasmids. However, numerous data show that once appropriate spacers are
introduced in CRISPR array by means of genetic engineering, the E. coli
CRISPR/Cas system becomes functional. Consequently, it is currently mostly assumed
that CRISPR/Cas system should indeed be induced by invasive DNA (see e.g. 
a model summarized by Figure 7 in [10]), though exact physiological 
conditions under which such induction occurs have yet to be understood. 
Furthermore, it is estimated that a very large proportion (in fact, most) of
E. coli phages are not yet known [21], which may provide a reason for the
absence of a match between E. coli CRISPR spacers and sequences of known
phages. Finally, our prediction that the system can generate high crRNA
levels shortly after its induction might prove to be relevant even if some
(currently) unexpected function of CRISPR/Cas system in E. coli emerges.
Minor issues: 

Check spellings/unclear expressions:
"archeaeal" 

"the processing rate becomes for an order of magnitude larger than 
the transcription decay rate" 

It appears that the authors are talking about unprocessed RNA decay 
rate. 

Brackets might help presentation of equation 1.6. 

"observed large generation of crRNAs from only few pre-crRNAs" 

observed amplification. . . 

Authors' response 

We corrected/implemented all the suggested changes in the text. 